Feature Transformation and Multivariate Decision Tree Induction

نویسندگان

  • Huan Liu
  • Rudy Setiono
چکیده

Univariate decision trees (UDT's) have inherent problems of replication, repetition, and fragmentation. Multivariate decision trees (MDT's) have been proposed to overcome some of the problems. Close examination of the conventional ways of building MDT's, however, reveals that the fragmentation problem still persists. A novel approach is suggested to minimize the fragmentation problem by separating hy-perplane search from decision tree building. This is achieved by feature transformation. Let the initial feature vector be x, the new feature vector after feature transformation T is y, i.e., y = T (x). We can obtain an MDT by (1) building a UDT on y; and (2) replacing new features y at each node with the combinations of initial features x. We elaborate on the advantages of this approach, the details of T , and why it is expected to perform well. Experiments are conducted in order to connrm the analysis, and results are compared to those of C4.5, OC1, and CART.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Global Induction of Decision Trees

Decision trees are, besides decision rules, one of the most popular forms of knowledge representation in Knowledge Discovery in Databases process (Fayyad, Piatetsky-Shapiro, Smyth & Uthurusamy, 1996) and implementations of the classical decision tree induction algorithms are included in the majority of data mining systems. A hierarchical structure of a tree-based classifier, where appropriate t...

متن کامل

DIAGNOSIS OF BREAST LESIONS USING THE LOCAL CHAN-VESE MODEL, HIERARCHICAL FUZZY PARTITIONING AND FUZZY DECISION TREE INDUCTION

Breast cancer is one of the leading causes of death among women. Mammography remains today the best technology to detect breast cancer, early and efficiently, to distinguish between benign and malignant diseases. Several techniques in image processing and analysis have been developed to address this problem. In this paper, we propose a new solution to the problem of computer aided detection and...

متن کامل

Induction of Multivariate Decision Trees by Using Dipolar Criteria

A new approach to the induction of multivariate decision trees is proposed. A linear decision function (hyper-plane) is used at each non-terminal node of a binary tree for splitting the data. The search strategy is based on the dipolar criterion functions and exploits the basis exchange algorithm as an optimization procedure. The feature selection is used to eliminate redundant and noisy featur...

متن کامل

Anomaly Detection Using SVM as Classifier and Decision Tree for Optimizing Feature Vectors

Abstract- With the advancement and development of computer network technologies, the way for intruders has become smoother; therefore, to detect threats and attacks, the importance of intrusion detection systems (IDS) as one of the key elements of security is increasing. One of the challenges of intrusion detection systems is managing of the large amount of network traffic features. Removing un...

متن کامل

A framework for bottom-up induction of oblique decision trees

Decision-tree induction algorithms are widely used in knowledge discovery and data mining, specially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998